Benbow, Camilla Persson and Stanley, Julian C. CONSEQUENCES IN HIGH SCHOOL 
AND COLLEGE OF SEX DIFFERENCES IN MATHEMATICAL REASONING ABILITY: A 
LONGITUDINAL PERSPECTIVE. American Educational Research Journal 19: 
598-622; Winter 1982. 

Abstract and comments prepared for I.M.E. by GRACE M. BURTON, University 
of North Carolina at Wilmington. 

1. Purpose 

This five-year longitudinal study was designed to determine what sex 
differences emerged in a sample of mathematically precocious youth from 
the time they were identified in grade 7 or 8. 

2. Rationale 

In six of the Johns Hopkins talent searches, there was a large sex 
difference in mathematical reasoning ability. Despite lack of differences 
in SAT verbal scores between the sexes, boys performed at a significantly 
higher level than did girls. While much of the sex-related literature 
suggests that differences will not be apparent when course-taking behavior 
is controlled, the sex differences were significant at the seventh-grade 
level, when the course-taking history of boys and girls is identical. The 
authors hypothesized that the male superiority evident at the seventh-grade 
level would increase during the high school years and would be at least 
partly accounted for by the early advantage in mathematical reasoning as 
evidenced on the SAT-M. 

3. Research Design and Procedures 

Participants in the first three talent searches of the Study of 
Mathematically Precocious Youth (SMPY) who scored at least 390 on the 
SAT-M or 370 on the SAT-V as seventh and eighth graders were sent a 
questionnaire four to five years after they took the tests. Four yearly 
"waves" of questionnaires were sent, the first in December 1976. The 
final response rates, after two follow-up procedures were employed, ranged 
from 90 to 94 percent, resulting in a total sample of 1,966 students. The 
sex composition of the final sample (62% male) was not significantly 
different from that of the population (61% male). While there were no 
differences in mathematics or verbal scores between male respondents and 
non-respondents, female non-respondents were significantly less 
mathematically able than their responding peers. The data were analyzed 
separately for waves one and two. A third analysis was done on waves three 
and four combined. A variety of SPSS procedures were employed. 

Effect sizes were calculated and classified as small, medium, or 
large (Cohen, 1977). Only large effects were considered important. 
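
As a rough illustration of how an effect size of this kind can be computed and labeled, the sketch below uses Cohen's standardized mean difference d with his conventional cutoffs of 0.2, 0.5, and 0.8. It is an assumption that d is the relevant index; Cohen (1977) defines other indices for other tests, and the abstract does not say which ones the authors used. The function names and sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def classify_effect(d):
    """Label |d| with Cohen's conventional cutoffs (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical scores, not data from the study.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(round(d, 2), classify_effect(d))
```

Because a significance test depends on sample size while d does not, a difference can reach p < .001 in a sample of nearly 2,000 students and still be classified as small.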

4. Findings 

At the end of high school the sex difference in SAT-M scores in favor 
of boys was significant at the .001 level and, as measured by effect size, 
of importance. A significant difference in SAT-V scores favoring females 
in the second wave of the talent search had disappeared by the time of 
the follow-up study. There were no significant sex-related differences 
in verbal scores by the time of the high school administration of the SAT. 
Seventh grade SAT-M was the best predictor with respect to courses in 
mathematics taken during high school. The significant difference (p < .001) 
in favor of males, because the effect size was small, was not considered 
important. Girls reported receiving somewhat better grades (p < .001) 
than did boys. The effect in this case was deemed important. Boys tended 
to take courses earlier during their high school careers and took calculus 
more frequently than did girls. These factors were rated as important. 
Only small amounts of variance in mathematics course-taking could be 
accounted for by family background. The best predictor was a retrospective 
one: having rated mathematics as a favorite course in high school. Liking 
for mathematics at the seventh-grade level was not a strong predictor, 
nor was sex or ability. 

More boys than girls took the College Board Math Level 1 test. While 
the difference was significant (p < .01), the effect was small. Male 
scores, however, on the more difficult Level 2 test were importantly 
larger. The ratio of males to females taking these or the AP examinations 
was sometimes as large as 3 to 1. Differences in participation in 
mathematics contests during high school were in favor of males, significant, 
but not important. 

No significant differences in reported college major were disclosed. 
There were significant, but not important, differences in favor of males 
in number of students electing mathematics in their first college semester. 

Asked to rank their liking of mathematics on a single global scale, 
males and females responded similarly. Girls, however, were more likely 
than boys to prefer verbal areas in high school. 

5. Interpretations 

Socialization appears not to be the explanation for sex differences 
in mathematics. Differences in mathematical reasoning ability are found 
in mathematically gifted youth as early as the seventh grade, when they 
could not be the result of differential course-taking. These differences 
have an effect on mathematics achievement. 

Male superiority in mathematics reasoning ability might be due to the 
fact that mathematically gifted males are developing intellectually at a 
faster rate than are mathematically gifted females. Mathematics course 
grade differences in favor of girls may be explainable by the better 
conduct and demeanor often found in female students. The fact that 
females take fewer mathematics courses despite these better grades may be 
due to their stronger liking of verbal areas than is true of males. 

Twelfth grade appears to be too late for intervention efforts designed 
to increase female participation in mathematics. Planners of these inter- 
vention efforts should take note that mathematical reasoning ability may 
be a more important predictor of mathematics achievement than is attitude 
towards mathematics. Factors contributing to the differences in achieve- 
ment have still not been isolated. 

Questions remain as to the degree to which the findings which were 
disclosed in this select group of mathematically talented youth can be 
generalized to other populations. 

Abstractor's Comments 

This ambitious (and expensive) study provides data for considering 
the mathematical course-taking behavior of a small but special group of 
students — the mathematically precocious. The study was reported at great 
length, and extensive tables were included for those who wish to delve 
more deeply into the responses of the nearly 2,000 students in the sample. 

The authors chose to use two criteria for reporting "significance" 
— first, the usual significance levels expressed as probabilities; second, 
effect size (Cohen, 1977). Fuller explanation of effect size, since it 
played a major role in the report of findings, might have been included in 
the text. 

Benbow and Stanley enter, not for the first time, an area of hot 
debate: with respect to sex differences in mathematical ability, is it 
Nature or Nurture? They state that a satisfactory answer is not yet 
possible, but repeat their 1980 conclusion that "putting one's faith in 
boy-versus-girl socialization processes as the only permissible explanation 
...is premature" (p. 620). They do seem to accept, however, that the 
"ability of males developed more rapidly than those of females" (p. 598). 
It may be a little premature for that conclusion as well. 

This abstractor is not convinced that seventh-grade boys and girls, 
even (especially) in 1972-4, had identical experiences in and out of 
school. True, up to that time they had taken the same number of 
mathematics courses, usually from the same teacher and in the same physical 
classroom. There is, however, no guarantee that the psychological 
classroom was the same for both. Copious research from that time period 
(cf. Casserly, 1975; Marlow and Davis, 1976) would suggest that it was 
not. Teacher expectations (Levine, 1976) and behavior (Good, Sikes and 
Brophy, 1973; Caplan, 1977) were found to vary according to sex of 
student. Nor was it, at that time, apt to be the case that parents 
(Helson, 1971; Astin, 1975; Rubin, Provenzano and Luria, 1976) or peers 
(Luchins, 1976) provided a sex-neutral environment where academic 
aspirations or leisure activities were concerned. It is unlikely that a 
girl gifted in mathematics was provided the same encouragement and support 
as an equally gifted boy. 

Even, however, if it were proven to be the case that gifted boys are 
innately more mathematically talented than gifted girls, the question 
remains: what shall we do with the finding? The fascinating applications 
of science that allow future parents to control in part the genetic makeup 
of their children are as yet ineffective in disclosing which babies 
will develop into scientists and mathematicians or into people who will 
use these disciplines to enrich the lives of others. Given this lack of 
data, and the enormous variation in individual capabilities and interests, 
it would seem, as Stephen Gould suggests, that imposing a biological value 
upon groups is an irrelevant and highly injurious enterprise (1980, p. 159). 

We can, on the whole, do little more than nurture talent where we 
find it, and spend currently-scarce resources developing ways to do this. 
Since more academically talented males than females take calculus and other 
advanced courses, could we make these courses more appealing to the female 
cohort? What could counselors, teachers, principals, or business people 
do to make non-required courses attractive and available? How can the 
mystique surrounding the recreational uses of mathematics be neutralized? 
If social pressures— so important during the adolescent years— impact 
differentially on the mathematically gifted student according to sex, is 
this remediable? If indeed boys are better mathematics reasoners than 
girls by the seventh grade, when does this superiority, which is not 
apparent during the elementary school years (Hooper, 1975), begin? If 
the parental influence on achievement attitudes continues strong (Parsons, 
Adler and Kaczala, 1982), what can be done about it? 

Benbow and Stanley are to be congratulated for helping raise so many 
questions, the answers to which can guide intervention programs. The task 
remains to find thoie answers, develop those programs, and implement them 
effectively. 
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Dreytus, Tommy a hd E i s e t^b e r g , Th e b do re INTUITIVE FUNCTIONAL 
CONCEPTS: A BASELINE STUDY ON INTUITIONS. Journal for Research 
in Mathematics Education 13: 360-380; November 1982. 

Abstract and commentary prepared for I.M.E. by JOE GAROFALO, 
Indiana University, Bloomington. 

1. Purpose 

"This study aimed to assess the intuitive background of junior 
high school pupils as they developed the concept of function" 
(p. 361). For the purpose of this study "the term 'intuitions' is 
taken to refer to mental representations of the facts that appear 
self-evident" (p. 360). 

2. Rationale 

The authors believe that intuitions play an important role 
in the understanding of mathematics and should be taken into account 
by teachers and curriculum developers. They feel that "the teaching 
process should be based on the intuitive knowledge of the learner, 
especially at the stage when a new topic is approached" (p. 361). 
The specific topic chosen for this study was the concept of 
function and a number of its subconcepts (i.e., image, preimage, 
extrema, growth, slope). Since intuitions are the result of 
personal experience, it cannot be expected that all students 
will have the same intuitions about functions. If instruction is to 
be intuitively based then "there is a need to assess first the 
basic intuitions and experiences various student populations 
have with functions ..." (p. 366). 

3. Research Design and Procedures 

Three versions of a 42-item multiple-choice questionnaire 
booklet were constructed. Each version contained both a concrete 
function and an abstract function. The three versions contained 
the same functional relationships, but differed in setting: 
either diagram, table, or graph. The breakdown of items for each 
was: five image, five preimage, and five extrema questions for 
both the concrete and abstract functions; five growth questions about 
the concrete function; five slope questions about the concrete 
function; and three slope questions about the abstract function. As 
a validity check, all included questions were classified by 
subconcept by at least four out of a panel of five high school and 
college mathematics teachers. KR-20 reliability was estimated 
at .91 for the full test and at .86 and .81 for the concrete and 
abstract subtests, respectively. 
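
For readers unfamiliar with the KR-20 coefficient cited above, the following minimal sketch computes it for right/wrong (1/0) item data using the standard Kuder-Richardson formula 20. It assumes the population variance of total scores is used (some texts use the sample variance instead); the function name and the tiny response matrix are hypothetical and are not data from the study.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability for a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    # Variance of each student's total score (population form, an assumption).
    total_var = pvariance([sum(row) for row in responses])
    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical responses: four students, three items.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(sample))
```

Values near 1 indicate that the items rank students consistently, which is the sense in which .91 for the full test is a high estimate.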

The three versions of the questionnaire were randomly 
distributed by teachers within 24 coeducational classes in grades 
six through nine in 12 different schools "at the beginning of the 
school year when none of the classes had yet studied the concept of 
function" (p. 369). Students in grades eight and nine had studied 
a unit on Cartesian coordinate systems in grade seven. Schools and 
classes were chosen to ensure homogeneous distribution over grade 
level and over an ability-social level variable labelled "absolv." 
(Students were classified high- or low-absolv according to their 
ability level and the percentage of disadvantaged students in 
the school they attended.) "In summary, each pupil was assigned four 
characteristics, Grade (6, 7, 8, or 9), Absolv (high or low), 
Setting (D, G, or T), and Sex (F or M)" (p. 369). A total of 443 students 
completed at least 90% of the questionnaire and were included in the 
analysis. 

4. Findings 

A four-way analysis of variance, using the mean score on the 
total questionnaire as the dependent variable, yielded the following 
significant effects: grade, absolv, setting, grade x absolv, 
absolv x sex, and grade x absolv x sex. Further analyses revealed 
the following: 

Although in general performance increased with grade, there 
was a significant decrease between grades seven and eight. 
High-absolv students outperformed low-absolv students at 
all grade levels, but the main progress for the high- 
absolv students came between grades six and seven, while 
that for low-absolv students came between grades eight and 
nine. 

The diagram setting presented more difficulties than the 
other settings at all levels of grade and absolv. The 
high-absolv students outperformed the low-absolv students 
on all settings, but the high-absolv students preferred 
the graph setting while the low-absolv students preferred 
the table setting. 

Even though the overall performance difference between 
boys and girls was non-significant, boys outperformed 
girls in grades six and seven while girls outperformed 
boys in grades eight and nine. Boys' (mainly low- 
absolv) performance dropped considerably between grades 
seven and eight, while that of girls remained relatively 
constant. Overall, high-absolv boys outperformed high- 
absolv girls while low-absolv girls outperformed low- 
absolv boys. However, in grade nine, high-absolv 
girls outperformed high-absolv boys. 

Performance on the concrete and abstract subtests 
paralleled performance on the full test. All significant 
effects on the full test carried over to the concrete 
test, while all except setting and grade x absolv x 
sex carried over to the abstract test. The trends 
observed on the full test carried over to both of the 
subtests. 

Image questions were answered best, while slope questions 
were answered worst. For slope questions, both high- and 
low-absolv students preferred the graph setting, but for 
all other questions the high-absolv students preferred the 
graph setting, while the low-absolv students preferred the 
table setting. 

5. Interpretations 

The authors reached the following conclusions: 

1. Pupils' intuitions on functional concepts do grow with 
their progress through the grades. 

2. No differences in the intuitions between boys and girls 
in junior high school were observed. However, there are 
indications that girls tend to develop their intuitions 
at a different rate from boys. 

3. High-absolv pupils demonstrate correct intuitions more 
often than low-absolv pupils. 

4. It is not true that intuitions in concrete situations are 
more often correct than in abstract ones. 



Abstractor's Comments 

For years scientists and mathematicians have written about how 
their intuitions have contributed to the discovery and development 
of many of their ideas. Indeed, the history of science is filled 
with episodes where intuitions have led to significant findings. 
Mathematicians and philosophers of mathematics have discussed the 
relationship between intuitions and mathematical reality and many 
great mathematicians have advocated an intuitive approach to 
mathematics instruction. Experienced mathematics educators and 
teachers are well aware that many topics are best taught through an 
approach that builds upon students' intuitive and common sense 
ideas rather than through a more formal approach. Unfortunately, it 
is the case that what is intuitively evident to one student may not 
be to another. Also, it is not clear how intuitions develop nor 
how they can best be utilized. Research on the nature, development, 
and role of intuitions is needed if we are going to capitalize on them 
in our teaching. 

Unfortunately, the notion is even more difficult to study 
than it is to define. In this study the researchers have only 
indirectly tapped the intuitions they set out to assess. 
Multiple-choice tests do not seem to be the most appropriate way 
to study mental representations of self-evident facts. It seems 
that what the researchers found out was whose representations led to 
more correct answers. It would have been more interesting and 
fruitful if they had: (1) looked at how students' pre-instructional 
interpretations of the diagrams, graphs, and tables led to their 
making sense and extracting information from them and (2) 
related these ideas and behaviors to the concept of functional 
relationship. In this way, explanatory information might have 
been gathered bearing on questions such as: "Why did the high-absolv 
students prefer the graph setting while the low-absolv students 
preferred the table setting?" and "Why did the performance of 
low-absolv boys drop so drastically between grades seven and eight?", 
etc. 

In any assessment like this one, it is very important to 
look at group differences, but in this regard the variable 
"absolv" has some shortcomings. Not only does it lack a clear 
analog in other settings, but some of its classifications are 
troublesome. For example, high-level students in schools with 
over 80% disadvantaged students were classified "low" while low- 
level students in schools with under 20% disadvantaged students were 
classified "high". If possible, it would have been preferable to 
classify students on the basis of some standard measures of 
mathematical and other cognitive performance to allow for clearer 
interpretation and generalization. 

The authors hypothesized that performance on the concrete 
function questions would be better than on the abstract function. 
I would have hypothesized the same. I wonder whether the 
inclusion of both on the same test had any effect on the end results. 
Did the concrete questions serve as hints to the abstract questions? 
If a follow-up study is done, it would be interesting to look 
at concrete/abstract differences if there were separate tests. 
This could be done by giving each student only one of the two 
versions or by giving all students both versions with the order 
of administration counterbalanced. 

It is very helpful for teachers to have information concerning 
their students' pre-instructional background on various topics. 
This study revealed some beneficial findings concerning students' 
untutored knowledge about functions. 

Edge, Douglas and Ashlock, Robert B. USING MULTIPLE EMBODIMENTS 
OF PLACE VALUE CONCEPTS. Alberta Journal of Educational Research 
28: 267-275; September 1982. 



Abstract and comments prepared for I.M.E. by LOYE Y. "MICKEY" HOLLIS, 
University of Houston-University Park. 

1. Purpose 

The purpose of the study was to determine if using multiple 
embodiments rather than a single embodiment of concepts related 
to three-digit numbers resulted in greater understanding of 
selected place value concepts. 

2. Rationale 

Educators and learning psychologists typically recommend and 
encourage the use of manipulative materials in teaching mathematics. 
This recommendation is supported by a number of research studies. 

Questions concerning the use of more than one material to 
teach selected mathematical concepts are often raised. Some 
mathematics educators believe presenting multiple embodiments of 
a concept will increase the student's ability to generalize the 
concept and not to associate it with any one particular 
embodiment. Research studies on the use of multiple embodiments 
do not agree on their value. 

3. Research Design and Procedures 

The subjects were selected from 50 middle-class students 
enrolled in the second grade. One student was initially eliminated 
and the remaining 49 were randomly assigned to one of two treatment 
groups. The final sample was two groups of 21 students, due to 
some students being absent too much and the need for equal-sized 
groups. 

This study employed a 2 x 4 factorial design, with repeated 
measures on the time dimension. Measures were taken on Days 6, 10, 
13, and on Day 20 following a seven-day retention period. 

Both groups were taught by the principal investigator. Coffee 
stirrers (prebundled in hundreds, tens, and ones), Dienes' 
base-ten blocks (flats, longs, and units), and chips from "Chip 
Trading Activities" (green, blue, and yellow) were used with the 
multiple embodiment treatment. The only physical model used 
with the single embodiment treatment was the coffee stirrers. 

There was a total of thirteen treatment periods of 30 
minutes. The order in which the two groups were taught was 
rotated so that each group was taught first on alternate days. 

4. Findings 

There was no difference between the overall means of the two 
treatment levels. The time factor was significant beyond the 0.01 
level and indicates there was a significant trend over time with 
the repeated measures dimension across both treatment levels. 
The interaction of Time-by-Group, however, was not significant, 
an indication that the growth pattern over time of the two 
treatment groups was roughly the same. 

5. Interpretations 

One possible interpretation of these results is that it does 
not make any difference whether one uses three concrete exemplars 
or only one to teach these numeration concepts to second-grade 
pupils. 

In summary, given the possible interpretations of the results 
of this study, it cannot be stated categorically that one should or 
should not use multiple embodiments to teach selected decimal 
numeration concepts to children in Grade 2. What must be noted, 
however, is that this study, like several others, does call into 
question the advantage of using multiple embodiments of a 
mathematical concept in the instructional process. 

Abstractor's Comments 

The researchers noted that sensitivity of the measuring 
instruments, instructor enthusiasm, length of treatment, or 
previous experience of the subjects might have affected their 
results. One thing that was not noted was the number of subjects. 
An "N" of 21, even with repeated measures, is small when there may 
be other factors at work. 

A question could be raised about the choice and/or number 
of physical models used with the multiple embodiment treatment 
group. The prebundled coffee stirrers and the Dienes' base-ten 
blocks are very similar, especially when used as they appear to have 
been used in this study. The use of one of these combined with 
the "Chip Trading Activities" might have proved more profitable. 

Evertson, Carolyn M. DIFFERENCES IN INSTRUCTIONAL ACTIVITIES IN 
HIGHER- AND LOWER-ACHIEVING JUNIOR HIGH ENGLISH AND MATH CLASSES. 
Elementary School Journal 82: 329-350; March 1982. 

Abstract and comments prepared for I.M.E. by JOHN ENGELHARDT, 
Southern Oregon State College. 

1. Purpose 

Both higher- and lower-achieving classes of junior high English 
and mathematics teachers were studied in order to observe instructional 
strategies which differed between the achievement groupings. 
Within this study a subset of teachers was selected and the present 
article focused on the narrative data as well as inferential 
statistical data of this subgroup. 

2. Rationale 

The author points to lack of specific research-based suggestions 
on how to differentiate instruction for ability groups. Given that 
classroom management is related to classroom composition (Doyle, 
1979), it would be helpful to know which specific techniques 
worked for different class compositions. 

3. Research Design and Procedures 

A sample of 51 teachers (25 English, 26 mathematics) from 
eleven junior high schools in a large southwestern urban district 
was observed in two of their class sections. Observers were 
trained in writing narratives focusing on management and organization, 
in rating student engagement (on task, off task, and shades in 
between), in rating specific components of the overall classroom 
behavior (44 specifics), and in maintaining time logs of the various 
classroom activities. Teachers were observed for 14 to 20 hours in 
each of two classes during the school year, with roughly half the 
observations in the first three weeks of school. A total of 1400 
observations of one-hour duration was taken. 

California Achievement Tests (CAT) from the previous year were 
used as covariates in measuring academic progress, with specially 
constructed achievement tests in mathematics and English administered 
at the conclusion of the school year after all observations were 
completed. Student attitudes toward school, instructor, and class 
were assessed prior to achievement testing using a form adapted from 
the Student Rating Scale of Instructors (Stallings, Needels, and 
Stayrook, 1979). 

To assess observer reliability, pairs of observers were sent 
to classrooms on 23 occasions and complete data sets were checked. 
The only problem encountered seemed to be some deletion of narrative 
material. Regular meetings were held with observers to maintain 
consistent understandings. Between-observer agreements of component 
ratings were reported at p < .12 and student engagement ratings at 
p < .001. Complete information regarding these assessments can 
be found in Evertson et al. (1980). 

A subset of the 51 teachers was selected for further study. 
These teachers (6 mathematics, 7 English) were determined by the 
fact that their two classes differed by two or more grade levels 
in mean entering achievement (based on the CAT). Teachers' 
low-ability classes had an entering mean of 2.8 grade levels below 
placement, while high-ability classes were 0.4 above grade placement 
on the average. 

Two-way ANOVA with subject matter a between-groups factor and 
ability level a within-groups factor was used to examine differences 
between higher- and lower-ability classes. 

4. Findings 

Subject matter differences revealed (p < .05) that English 
teachers were rated higher in occurrence of verbal class 
participation, nurturing students' affective skills, and relating content to 
pupil interest and background. Ability differences revealed 
(p < .05) that teachers of higher-ability classes were rated as nurturing 
affective skills and maintaining a task-oriented focus. 
Teachers of lower-ability classes had significantly more disruptive and/or 
inappropriate behavior and had more conferences to stop this 
behavior. The author reported other results with higher p values 
(.06 < p < .11). 

Higher-ability classes across both subjects had a larger 
percentage of students on task (p = .05). Higher-ability classes 
also had more transitions than lower-ability classes, but the 
average length and total transition time were greater (p < .05) 
in lower-ability classes. 

Evaluation of narrative data on mathematics classes revealed 
that teachers did not vary much in activity pattern either among 
themselves or between ability levels. There was no real 
differentiation of instruction across higher- and lower-ability classes. 
The pattern observed was essentially the following: opening 
(mostly procedural), checking and grading, lecture/discussion, 
seatwork, close. Additionally, the time allocated for these 
components did not differ significantly across teachers or 
levels of ability. 

The author reported on a case study of the management 
techniques of two mathematics teachers in their lower-ability classes. 
Teacher B was labeled as "reactive" and teacher F as "proactive." 
Teacher B was plagued by inappropriate behavior which disrupted 
her attempts to help students. Teacher F, whose lower-ability 
class had the highest residualized achievement scores of the six 
mathematics teachers, differed in how he structured activities. 
He allowed more time for checking and discussion (13.7 minutes-F, 
6.5 minutes-B) and lecture and seatwork introduction (14.4 minutes-F, 
8.7 minutes-B) with substantially less time for seatwork (22.5 
minutes-F, 35.1 minutes-B). He incorporated seatwork practice 
into his lecture, thereby distributing student involvement with 
the material and allowing for more immediate feedback. This 
resulted in higher task orientation for the class, as noted by 
observers. 
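
The residualized achievement scores mentioned above can be illustrated with a small sketch: each class's posttest mean is compared with the value predicted from its entering (pretest) mean, and the leftover difference is the residual. This assumes an ordinary least-squares line with a single covariate, which may be simpler than the model the author actually used; the function name and all numbers are hypothetical.

```python
from statistics import mean


def residualized_scores(pretest, posttest):
    """Residuals of posttest after a least-squares fit on pretest."""
    x_bar, y_bar = mean(pretest), mean(posttest)
    sxx = sum((x - x_bar) ** 2 for x in pretest)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pretest, posttest))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Positive residual: the class scored above what its pretest predicted.
    return [y - (intercept + slope * x) for x, y in zip(pretest, posttest)]


# Hypothetical class means, not data from the study.
print(residualized_scores([1, 2, 3], [2, 4, 9]))
```

Least-squares residuals sum to zero across the whole set of classes used to fit the line, so a negative value means a class did worse than predicted relative to that set, not that it lost ground in absolute terms.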

5. Interpretations 

Analysis of quantitative and qualitative data revealed that: 

1) Lower-ability classes, although smaller in number, are harder 
to manage and keep on task than higher-ability classes. 

2) Teachers did not differentiate instructional approaches for 
ability levels. 

3) There are ways of providing instruction in low-ability 
classes to increase productive use of time and student 
involvement. 

4) Active teaching (proactive) can be a useful way to view 
instruction. 

Abstractor's Comments 

This study contained a number of elements of thorough 
classroom research. The sample size was quite large, the data 
collected were quite detailed, and the length of study was of 
sufficient duration to justify belief in the findings. However, 
no mention was made as to how the sample was selected, so 
assumptions of randomness and implications of results for a population 
larger than the sample are questionable. 

A small point, but one worth mentioning, was the consistent 
use of ability groups when in fact the students were grouped by 
achievement. Correct usage was made in the title and promptly 
abandoned. Little mention was made of the posttest, which was
specially constructed. It is interesting to note that of the 12 
mathematics classes in the substudy, five of the six lower- 
achievement classes and three of the six higher achievement 
classes had negative residual achievement scores. It is not
clear how to interpret this. In addition, no units were reported 
for the posttest. It also seems that the CAT was used both as 
a covariate measure and as a high/low grouping measure which is 
not entirely appropriate statistically. 






The richness of the narrative data was much appreciated by 
this reviewer, who found the case study report quite interesting. 
It laid some foundation for further exploration in search of
effective instructional strategies for achievement groupings.
What struck this reviewer was the surprising qualitative differences
between teachers B and F, along with the corresponding lack of
quantitative differences. Teacher B's residual achievement
score was -.09, with 75 percent on-task academically and only 5
percent off-task. Teacher F had a residual achievement of .06,
with 85 percent on-task academically and 6 percent off-task. This
perhaps points up the significant value of narrative data in
trying to ascertain what goes on in the mathematics classroom
aside from the typical quantitative measurables.

Aside from the criticisms mentioned above, this reviewer 
thought highly of the study. Several thought-provoking and
disturbing questions arise. Are we as mathematics teachers
so set in our pattern that we fail to differ not only from each
other, but for the students we teach? Are we so inclined to mold
the student to our style rather than looking for ways to adapt?

Proactive teaching or direct instruction has research grounding 
(Good and Grouws, 1978, 1979) as an effective way to teach
mathematics, especially to lower-achieving students at the
intermediate grade level, but has not been as pronounced in
success at the junior high level. Perhaps more study will bear
out Teacher F's style as an appropriate way to differentiate
instruction. Clearly something needs to be done to effect more
academically productive time for students in these classes.

Evertson, Carolyn M. and Emmer, Edmund T. EFFECTIVE MANAGEMENT AT THE
BEGINNING OF THE SCHOOL YEAR IN JUNIOR HIGH CLASSES. Journal of Educational 
Psychology 74: 485-498; August 1982. 

Abstract and comments prepared for I.M.E. by CHARLEEN M. DERIDDER, Knox 
County Schools, Tennessee. 

1. Purpose

The purpose of this year-long study was to identify and assess 
beginning-of-the-year management practices of groups of junior high school 
teachers of English and of mathematics that were selected and categorized
from the data gathered as more effective and less effective classroom
managers. The study sought to answer the question of how more effective
managers differed, if at all, from less effective managers in their
management procedures during the first three weeks of school.

2. Rationale

The authors quote educators who point out the importance of management 
skills and note the correlation of management variables with student 
achievement gains. Several studies were cited that have contributed to 
the body of information on this subject. Most such studies have been 
short-term and cross-sectional. It was conjectured that a longitudinal 
study, such as this one, might provide a more adequate base for suggesting 
how to initiate classroom behavior to promote long-term management 
effectiveness.

3. Research Design and Procedures 

Essentially, the study first accumulated data on 26 mathematics and 
25 English teachers for a three-week period. These teachers were 
volunteers from 11 different junior high schools in the southwest. The 
study included three-fourths of eligible experienced teachers and
one-half of eligible first-year teachers. These data were set aside, and
data continued to be gathered on these teachers. At the end of the
school year, four subsets of teachers (six more effective and six less
effective managers among mathematics teachers, and seven more effective and
seven less effective managers among English teachers) were determined on
the basis of the data collected after the first three weeks. Analysis
was then made of the data collected on these groups during the first three
weeks.

The data gathered involved the comparison of means of more effective
and less effective managers with respect to percent of students on-task,
percent of students off-task, observer management factor, residual student
achievement, and student rating of teacher. Selection of the subsample of
the four teacher groups was based on computing and summing across these
criteria, which provided a composite management effectiveness ranking.
These teachers taught classes that had similar average achievement levels. 

The procedure of the study began with the work of 18 trained
observers in the classrooms of these 51 teachers.

The first three weeks. Each teacher was observed on the
first, second, and fourth day, then three or four times the
next two weeks in one certain class. Each was also observed
four or five times in a second class during the second two
weeks.

The remainder of the school year. Observers were reassigned to
observe different teachers. They observed each of two classes
per teacher every three to four weeks.

The observation data included classroom narrative records based on a 
set of 42 guideline questions. Observers dictated a record from their 
notes into audiocassettes; a typical narrative was seven to ten pages
long. A time use log was compiled on each teacher. A record of Student
Engagement Rates (SER) was kept which described students as on- or off-task
in academic or procedural activities for each class session. After each
observation, the observer rated, on a five-point scale, selected managerial,
instructional, and behavioral characteristics. These data are labeled 
Component Ratings (CR), and consist of 36 items which describe teacher and 
student behavior. 

Narrative Ratings (NR) were compiled for a teacher's first three
weeks based on the classroom narratives. The project staff made summary
ratings of 29 behaviors and characteristics based on procedures used in
an earlier study of elementary teachers.

California Achievement Test score data from the preceding spring 
were used to determine class means as a basis of entering achievement
levels. These means were also used as a predictor when computing residual
student achievement. At the end of the year, students were tested in
English and mathematics by tests which reflected the content of the
district-wide adopted textbooks. 
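
The abstract does not report how the residual achievement scores were computed. As a minimal sketch, assuming a simple linear regression of each class's end-of-year test mean on its entering CAT mean, the residual for a class is its observed posttest mean minus the mean predicted from the CAT. The function names and all data values below are hypothetical and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residual_achievement(pretest_means, posttest_means):
    """Observed posttest mean minus the mean predicted from the pretest."""
    slope, intercept = fit_line(pretest_means, posttest_means)
    return [y - (intercept + slope * x)
            for x, y in zip(pretest_means, posttest_means)]


# Hypothetical class means (illustrative only, not data from the study).
cat_means = [40.0, 45.0, 50.0, 55.0, 60.0]        # entering CAT class means
posttest_means = [42.0, 44.0, 53.0, 54.0, 62.0]   # end-of-year test class means

residuals = residual_achievement(cat_means, posttest_means)
for cat, res in zip(cat_means, residuals):
    print(f"CAT mean {cat:5.1f}  residual achievement {res:+.2f}")
```

Under this assumption a positive residual marks a class that finished above what its entering achievement predicted, and a negative residual marks one that finished below it. Residuals from a least-squares fit sum to zero across the classes used in the fit.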

Once the subsample was identified, the focus of the study was an 
attempt to determine management behaviors which characterize more effective 
managers as compared to less effective managers during the first three 
weeks of school. The data gathered during the first three weeks were then 
analyzed in a variety of ways. 

Student Engagement Rates of the four groups were compared using two- 
way analyses of variance. In terms of Component Ratings, the average 
rating on each variable was computed across observations and a series of 
two-way ANOVAs (more vs. less effective, mathematics vs. English) was run.
Next the Narrative Ratings were analyzed using a series of two-way ANOVAs.

Additional analyses were performed to address several questions, such
as whether there were initial differences in student behavior in the
classrooms of the more and less effective managers.

Reliability checks of the observation variables were performed using
both between-observers agreement and between-periods stability coefficients.
The reliability of the achievement and attitude measures was determined
using internal consistency coefficients.

There was also a summary made of the correlations between each 
criterion and CR or NR variables which shows consistency across assess- 
ments, residual achievement, observer management factor, off-task, and 
academic on-task. One exception was the residual achievement criterion 
in English, which showed weak correlations.

4. Findings 

The answers sought by this study were to the questions of whether and 
how more effective and less effective managers differed in their behavior
at the beginning of the year. The researchers identified significance of
results at the 10%, 5%, and 1% levels. The results of the SER variables
indicated that more effective managers in both English and mathematics had
higher on-task rates (significant at the 5% level), and lower off-task,
unsanctioned behavior rates and less dead time (at the 10% level).

In terms of Component Ratings, there was a significant difference
between more effective and less effective managers with respect to 16 of 36
items. Four of those were at the 10% level, seven at the 5% level, and five
at the 1% level. These items included clarity in giving directions, stating
desired attitudes, presenting clear expectations for work standards, and
consistency of response to inappropriate behavior. More effective English
teachers (but not mathematics teachers) were rated higher than less effective
teachers on the variables of describing objectives clearly, using materials
that effectively supported instruction, and using and encouraging analytic
processes.

Results obtained in the comparison of Narrative Ratings indicated
significant differences in 22 of 29 items. One was at the 10% level, 13 at
the 5% level, and eight at the 1% level. These items included instructional
clarity and coherence, regular academic feedback to students, effective
monitoring of student work, effective intervention to stop students from
avoiding tasks, frequency of unsolicited call-outs (less for the more
effective managers), and social talk among students during seatwork and
lecture. Only a few subject matter effects and interaction effects were
noted.

5. Interpretations 

There are several broad themes indicated by clusters of variables
differentiating more and less effective managers. More effective managers
were more successful in teaching rules and procedures to students. These
teachers were more attentive to and more immediately responsive to
undesirable student behavior. They were more consistent in maintaining the rules
of classroom procedure. More effective managers rated considerably higher
than less effective managers in their ability to maintain student
responsibility for productive use of time. More effective managers were more
successful in those variables related to communicating information clearly
to students. Another major area of difference was that of organization of
instruction. More effective managers had less wasted time in their
activities and more time on task.

Although certain behaviors were identified as antecedent conditions 
in effectively managed classrooms, the conclusion cannot necessarily be 
made that they are causal factors. However, common factors between this
study and other management research suggest that these behaviors contribute 
to year-long management effectiveness. 

Abstractor's Comments

This detailed, in-depth, comprehensive study was supported in part by
the National Institute of Education. The amount of time and effort expended
in this study is certainly impressive. The work of the classroom observer
required that he/she make notes based on 42 guideline questions, tabulate
time use by students every 15 minutes, and complete a 36-item rating of
teacher and student behavior on a scale of 1 to 5 for each class.
Computation produces the estimate that each of 51 teachers was observed 32 times.
The narrative record alone for each observation was 6 to 7 pages, which means
some 13,000 to 14,000 pages of data.

Great care was taken on the part of the researchers to determine the
reliability of the assessment procedures and to guard against bias on the
part of the observers. The fact that this study followed and, to some
extent, was based on previous work provides continuity and reinforcement
of significant variables with respect to effective classroom management.
Also, the researchers are to be commended for the way the more
effective and less effective manager subjects were identified for the study.

There are some questions and concerns, however, that might be 
mentioned. 

1. Although data tables in the study indicated at which of the three 
levels variables were significant, the authors' discussion of the findings 
was generalized and did not differentiate among results with respect to
levels of significance.

2. Was there any disparity in the socio-economic levels of the students 
in the classes observed in the eleven different schools? If so, was there 
any correlation between this and the identification of the more vs. less 
effective management teachers? 

3. Mention was made that the content of the end-of-year English test 
assessed mainly usage, punctuation, and spelling, while writing skills,
literature objectives, and other communication skills were not addressed. 
It would appear that only lower-level cognitive skills were assessed in 
terms of student achievement. This might suggest that the identification 
of these classroom management skills characteristic of effective managers 
be qualified as those capable of producing student achievement of lower- 
level cognitive English skills. 

4. In a similar way, the nature of the mathematics test requires 
description. Was it primarily a test of computational skills? Were there 
items involving concepts and applications of mathematics or use of problem- 
solving skills? It is conceivable that a teacher in the process of teaching 
problem solving might have students in the Polya phase of trying to
"understand the problem." Such students tend to exhibit apparent off-task,
non-purposeful behavior while mulling over the problem. Such behavior might be
labeled "off-task" by the observers in this study.

5. In compiling the data, the authors list the mean scores of the four
groups, indicating the differences between more effective and less effective
management in mathematics and in English. Comparisons are then made by
grouping the mathematics and English more effective managers together and
the mathematics and English less effective managers together. It can be
observed from the data in the tables that in four of the 16 variables
found significant, the means of the more and less effective managers in
mathematics differed by only 0.3 on a scale of 1 to 5, differed by 0.1 on
one of these variables, and were identical on still another on the CR
instrument. On five variables where no significance was found, the mean
scores of the mathematics subjects differed by 0.3 to as much as 0.7.
On the NR instrument, there were two of the significant variables which
indicated the means of the mathematics subjects were identical. It might
be useful to examine the data for the mathematics subjects separately.

In terms of the findings of the study, it would be interesting to know:

a. Were any of the more effective managers first-year teachers?

b. Was the number of years of experience a factor? 

c. Was there a maximum of management effectiveness at any given 
number of years of experience? 

d. Was there a greater incidence of more effective managers from 
any one or more of the eleven schools? 

e. In the same vein, was there a greater incidence of less
effective managers from any one or more of the eleven schools? 

In light of the current tremendous concern for teacher accountability,
a study such as this certainly has merit. While research appears to
indicate that certain classroom management techniques are essential to
student achievement, perhaps researchers should also give attention to
that which is being achieved, or that which is recommended to be achieved
by the educational community. Such findings might suggest a classroom
climate that would add a different dimension to the management variables
identified as significant in this study.

Fogarty, Joan L. and Wang, Margaret C. AN INVESTIGATION OF THE CROSS-AGE
PEER TUTORING PROCESS: SOME IMPLICATIONS FOR INSTRUCTIONAL DESIGN AND
MOTIVATION. Elementary School Journal 82: 451-469; May 1982.

Abstract and comments prepared for I.M.E. by ROY CALLAHAN, State University 
of New York at Buffalo. 

1. Purpose 

The study examined the cross-age peer tutoring process, with special 
consideration being given to the social dynamics in the tutoring situation. 
Particular attention was focused on attitudes between tutor and tutee, and 
the motivating effect of the process on the students involved in the 
tutoring process. 

2. Rationale 

Studies have pointed to increased increments of achievement by tutors 
and tutees involved in the tutoring process. To what are such increments 
attributable? An obvious response is that it provided increased time on 
task. However, a number of studies attribute the change to the social 
dynamics involved in the tutoring situation. This study drew heavily from
the works of Lippett (1976) and Sarbin (1976), who suggested that the tutor- 
tutee relationship is the primary reason for the beneficial impact of the 
tutoring process on achievement. This study attempted to describe the 
attitudinal and motivational factors at play in the dynamic tutor-tutee 
interactions. 

3. Research Design and Procedures 

The study took place in a university K-8 laboratory school. Three 
multi-age groupings were made: primary, intermediate, and middle school. 
It was a two-phase study: (1) remedial mathematics instruction and (2)
computer literacy instruction. Three male and three female middle school
tutors worked with 11 male and one female tutees from the primary and
intermediate groups in phase 1 (remedial mathematics). Six male and no
female middle school tutors worked with four male and two female
intermediate tutees in phase 2 (computer literacy). Tutors were selected from
volunteers on the basis of interest in tutoring and proficiency in
mathematics.

Tutors received four 30-minute training sessions prior to their first 
tutoring session. These sessions introduced them to the mathematics 
content and material of the tutoring lessons and also focused their
attention on diagnosing tutees' problems, finding alternate instructional
explanations, and providing encouragement to tutees to complete lessons. A 
role-playing procedure was used in these training sessions. 

In phase 1 (remedial mathematics), six tutors were randomly assigned
to work with two tutees, one at a time, for 30-minute sessions twice a week
for eight weeks. In phase 2 (computer literacy), six tutors worked with
six tutees (one-on-one), again for 30-minute sessions twice a week for
eight weeks.

During phase 1, verbatim protocols of verbal interactions
between tutor and tutee were collected. Four ten-minute observations were
made by two trained observers during the middle six weeks of the study.
A blind rater checked protocols recorded independently by the two observers. 
During phase 2, the tutoring sessions were tape-recorded and then transcribed 
to obtain four ten-minute observations similar to phase 1. 

Independent variables examined were: (1) affective dimensions of
verbal behavior categorized as to (a) locus of initiative, (b) influence,
(c) verbal reinforcement; (2) instructional verbal behavior dimensions such
as explanations, providing examples, or asking questions; (3) student
learning progress as measured by task completion rates in mathematics and
performance on a 20-item multiple choice test on computer literacy; (4)
tutor-tutee attitudes and perceptions based on interviews of these sets of
people and their teachers.

4. Findings

In regard to the affective dimensions of verbal behavior, there were
no significant differences between verbal interactions initiated by tutors
or tutees. This result indicated that both tutor and tutee take active
roles in the tutoring process. In regard to influence, verbal behaviors
considered "directive" accounted for about 31% of the tutor behavior;
however, about 50% of the tutee behavior was "directive." This result
indicated that the tutor-tutee relationship was a give-and-take situation
with assumption of the "directive" role quite evenly distributed.
Approximately 20% of the tutor behaviors were classified as positive
reinforcement while 12% were classified as negative reinforcement.

In regard to instructional verbal behaviors, tutors tended to limit 
most of their instructional behaviors to explaining directions, asking
questions, confirming correct responses, or pointing out incorrect
responses. A majority of the tutees' instructional verbal behaviors were
related to answering questions from the tutor or instructional materials. 
These results indicated that tutors employed a very restricted number of 
instructional behaviors in the tutoring process. 

As a further descriptive refinement, the study examined the extent to 
which interactions between tutors and tutees are related to age and sex of
the tutoring dyads as well as the nature of subject matter taught.
Regarding age, it appeared that with older tutee groups more verbal behaviors
were initiated by the tutee than the tutor, and there was also significantly
greater frequency of tutee responses to tutor questions in dyads with
younger tutees. Regarding sex influences, it was found that a significantly
greater proportion of verbal behaviors was initiated by the tutee rather
than the tutor in same-sex dyads when compared with different-sex dyads.
Different-sex dyads indicated a significantly greater frequency of tutee
responses to tutor questions and statements than did same-sex dyads. No
differences were found in any of the behaviors when the dyads were divided 
according to differing subject matter. 

In regard to student learning progress, the tutees' task completion
rates in mathematics were lower than the class rates before the tutoring
began, surpassed them during the program period, and maintained these gains
even after the program was completed. With the computer literacy phase,
there was a significant difference in gain score for the tutee group when
compared to a comparison group on age and mathematics achievement.

In regard to attitude and perceptions, tutor interviews indicated that
a majority expressed a positive attitude toward the program and a desire to
stay with the same tutee. There was some slight tendency indicated to work
with a different student, mostly among female tutors who requested a change
to a same-sex tutee. Tutee interviews indicated that a large majority of
tutored students had positive feelings about the experience. Both
"receiving" (teachers of the tutees) and "sending" (teachers of the tutors)
teachers viewed the program as an effective way to provide remedial
instruction to slow students.

5. Interpretations 

This descriptive study of the interpersonal dynamics at play in a
classroom cross-age peer tutoring situation involving remedial mathematics
and computer literacy instruction tends to provide evidence that the
tutor's role is based more on friendship than on teacher-like authority.
The data suggested a situation that could be characterized as a give-and-take
friendship between peers rather than one where the tutor dominates and
directs; initiating and directing roles interchanged between tutor and tutee
with relative equality. The data indicated that the student tutors used a
very restricted range of instructional techniques, which suggests that the
tutoring process may be most useful for practicing or reviewing tutee's
skills rather than developmental work. The data also suggest that age and
sex of students in the tutoring dyad may affect the character of the
interactions in the tutoring process. Same-sex and similar-age dyads may provide
an instructional situation based on give-and-take friendship; opposite-sex
and dissimilar-age dyads may tend toward more of a teacher-like authority
characteristic in the tutoring situation.

Abstractor's Comments

For a number of reasons, this is not a very important study. However, 
it does make some minimum contribution to the accumulation of knowledge 
about cross-age peer tutoring as an instructional procedure. 

The formal use of older and more knowledgeable students to assist
younger less knowledgeable students in schools has quite an extensive
history. Monitorial schools gained much popularity in this country during
the first half of the 19th century, in great measure due to the limited
funding for education. Necessity being the mother of invention, schools
began to utilize students (mostly boys) who knew a little in teaching the
others (again mostly boys) who knew less. In a fit of hyperbole, the English
educator Lancaster, whose name the system popularly assumed, wrote: "The
system spread from Thames to Ganges; it has encircled the equator; it has
encompassed the poles" (p. 9).

Although the formal monitorial school faded from the educational scene
around the mid-19th century, there has been some continued interest in the
use of students as tutors in the schools. A strand of research has developed
that has not only tried to assess the impact of the procedure on the
achievement of the tutor and the tutee, but also to understand better the
dynamics at work in the tutoring session. It was the latter that was
addressed by this piece of research.

Great care must be taken in interpreting this research because of the 
small and selective number of students involved. This shortcoming is somewhat
mitigated by the fact that the study was an intensive examination of
affective factors at play in the tutor-tutee situation. Yet care should
still be taken when generalizing from the data.

Probably the main contribution of the study is the additional support 
it provides for Sarbin's (1976) contention that the contribution of tutor
to tutee may come from that person's role as friend that may be beneficial
in the situation, and not the fact that more teaching time is provided the
student. This study also suggests that where there is a sex difference,
and greater age differentials between tutor and tutee, the role of tutor may
take on more of a teaching character and less a give-and-take friendship
character. The friendship role appears to be optimized when the tutor and 
tutee are of the same sex and have little age differentiation. 

Another limiting factor in the study was the measures of student 
progress used in the remedial mathematics phase. It appeared that tutees 
increased their pace of going through instructional materials when working
with a tutor, and upheld the pace after tutoring was withdrawn. But nothing
is mentioned of the quality of the learning taking place as students go
through the instructional materials. It may be that faster is not
necessarily better for the slower students who served as tutees.

And finally, the data suggested that tutors were not particularly
adept in the instructional process and were most effective for practicing
or reviewing the tutees' skills. This tends to bring us back to the
monitorial role played by older and more knowledgeable students in the
Lancasterian scheme of instruction 100 and 75 years ago. If there were
nothing beyond this instructional role, then the monitoring might be better
done by a CRT attached to a microcomputer. But this study, along with
others, suggests that there may be a more critical dynamic at play between 
tutor and tutee that makes a contribution to the tutee (and perhaps the 
tutor). Naisbitt (1982) has used the terms "high tech" and "high touch" 
when describing the need for counterbalancing human responses (high touch) 
with new technology (high tech) that is introduced into a society. This 
study provides an additional glimpse of high touch at play in an
instructional setting.

Lee, Kil S. FOURTH GRADERS' HEURISTIC PROBLEM SOLVING BEHAVIOR.
Journal for Research in Mathematics Education 13: 110-123; March 1982. 

Abstract and comments prepared for I.M.E. by JACK EASLEY, University
of Illinois at Urbana-Champaign.

This study assesses the results of teaching a group of eight 
fourth-grade children how to use four of Polya's heuristics on story 
problems, most of which involve combinatorics or proportionality.
Heuristics were demonstrated for five sessions and practiced for
15 more, with one story problem to a session. When given similar
story problems in individual interviews afterwards, the group instructed
performed phenomenally better than a group of similar children who
went to regular mathematics classes in school during the same time.
Since nothing is said about what they did, we may presume that they
did not study combinatoric or proportionality story problems, since
these are rather unusual in fourth grade.

Another comparison was made between two sub-groups of the eight 
children who were taught in the unusual way. One group of four were 
average students who had also performed at level IIA on the Inhelder- 
Piaget pendulum and balance tasks and the other group were four above- 
average students who performed at level IIB on those two tasks. It
was found that, on four of the six post-instruction story problems
where multiplication was appropriate, all of the above-average, level
IIB students used multiplication, and two of them used multiplication
on each of the other two problems where it was appropriate. (There
were just two problems on which multiplication was inappropriate.)
Among the four children who were judged of average ability and who
performed at level IIA, only two multiplied once each; they added
much more often.

Abstractor's Comments

The research design confounds training in heuristics with training 
in working combinatoric and proportionality problems, and it confounds
the teacher's judgment of student's ability with the level on Inhelder-
Piaget tasks. The author concludes the study with five "hypotheses
based on the observed results and the theoretical rationale of the
study." Two of them appear, because of the design, not to be based
on the results and it is unclear what their theoretical basis is.
They are: "1. Specific heuristics adapted from Polya can be effectively
incorporated into the problem-solving experience of fourth graders
... Hypothesis 4: In a problem-solving situation where multiplication is
appropriate, IIA children use addition procedures primarily, and IIB
children use multiplication as well as addition procedures." Possibly,
by calling these and other conclusions "hypotheses," it is intended to 
protect them from criticism. 

There are also two pages devoted to summarizing the strategies
children used on particular problems in the interviews. Because the
problems are unusual, the errors tend also to be unfamiliar. A more
descriptive account of the thought processes of children during
instruction, as well as during interviews, would have been helpful.
This study is a mixed type, which leaves out much that would be expected
from both experimental and clinical perspectives. Perhaps that is the
fate of mixed studies.

Stigler, James W.; Lee, Shin-Ying; Lucker, G. William; and Stevenson,
Harold W. CURRICULUM AND ACHIEVEMENT IN MATHEMATICS: A STUDY OF
ELEMENTARY SCHOOL CHILDREN IN JAPAN, TAIWAN, AND THE UNITED STATES.
Journal of Educational Psychology 74: 315-322; June 1982.

Abstract and comments prepared for I.M.E. by MORRIS LAI, University
of Hawaii.

1. Purpose 

In order to better understand cross-national differences in
mathematics achievement that have been found at the secondary school
levels, relationships among elementary school curricula and mathematics
achievement at grades 1 and 5 in Japan, Taiwan, and the United States
were investigated.

2. Rationale 

Such cross-national investigations are seen as valuable for
understanding the influence of social, cultural, and educational
factors on students' learning. In order for meaningful interpretations
to be possible, differences in curricula must be taken into account
in the construction of tests to be used in the research.

3. Research Design and Procedures - Part I

First, an analysis of the most recent, popular textbook series
used at each of the sites [Sendai, Japan: New Mathematics (1978);
Taipei, Taiwan: Public Elementary School Mathematics (1978); and
Minneapolis: Mathematics Around Us (Scott-Foresman, 1978)] was
conducted. This analysis consisted of the construction of a list
containing each concept and skill presented and the grade level and
semester in which each was first introduced.

4. Findings - Part I

Of the 320 topics listed, 64% appeared in all three curricula,
91% appeared in the Japanese series, 81% in the Taiwanese series, and
78% in the American series. Thus, Japanese textbooks expose children
to more topics than do textbooks from Taiwan or the United States.

In terms of school year and the introduction of concepts/skills,
Taiwan was behind the other two countries at the middle of grade one
and the middle of grade five. The American curriculum kept pace with
that of Japan through the first grade but was behind by the middle of
the fifth grade. Throughout the six years of the elementary school
curricula, only 26 of the topics were introduced during the same
semester in all three countries.

The curriculum analysis provided a base on which to build a 70-item
achievement test designed for individual administration. Items
were ordered according to the mean grade level at which the underlying
concepts or skills were introduced. A combination of native and
bilingual speakers was used to ensure comparability of items across
the three language groups.

5. Research Design and Procedures - Part II

Random samples of children from 40 classrooms in 10 schools
chosen to "represent a random sample of elementary schools" in each
of the three locations were selected. Two boys and two girls were
randomly selected from the upper, middle, and lower thirds of the
distribution of reading scores obtained in each classroom, resulting in
a total of 240 first graders and 240 fifth graders from each country.
Sendai and Minneapolis were cited as comparable in size and general
economic and cultural status. Taipei was noted as comparable in size.
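
The within-classroom selection described above (two boys and two girls
from each third of the reading-score distribution, twelve children per
classroom) can be sketched as follows. This is only an illustration:
the function name, the tuple layout, and the way the thirds are cut
(by rank, with any remainder going to the top third) are assumptions,
since the abstract does not describe the mechanics.

```python
import random
from typing import List, Tuple

Child = Tuple[str, str, float]  # (hypothetical id, sex "M" or "F", reading score)

def sample_classroom(children: List[Child], rng: random.Random) -> List[Child]:
    """Pick 2 boys and 2 girls from each third of the reading-score ranking."""
    ranked = sorted(children, key=lambda child: child[2])
    third = len(ranked) // 3
    thirds = [ranked[:third], ranked[third:2 * third], ranked[2 * third:]]
    chosen: List[Child] = []
    for stratum in thirds:
        for sex in ("M", "F"):
            pool = [child for child in stratum if child[1] == sex]
            # random.sample raises ValueError if a stratum has fewer than 2 of a sex.
            chosen.extend(rng.sample(pool, 2))
    return chosen

# Made-up classroom of 30 children with alternating sex and distinct scores.
rng = random.Random(0)
classroom = [(f"child{i}", "M" if i % 2 else "F", float(i)) for i in range(30)]
picked = sample_classroom(classroom, rng)
```

Twelve children per classroom gives 240 per grade if 20 classrooms were
sampled at each grade level, which is consistent with the reported 40
classrooms only on the assumption that they were split evenly between
grades 1 and 5.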

First graders started at Item 1 and continued until four successive
items were missed. Fifth graders began with Item 35, which had a lower
than fifth-grade level of difficulty, and continued until four
successive items were missed. If a fifth grader missed any of Items
35-38, the child was taken back to a lower level item.
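
The stopping rule above can be sketched in a few lines. This is an
illustration under stated assumptions: the function name `administer`
and the `answers_correctly` callback are hypothetical, and the step in
which a fifth grader is taken back to a lower level item is left out
because the abstract does not say how far back the child was taken.

```python
from typing import Callable, List

def administer(start_item: int,
               answers_correctly: Callable[[int], bool],
               n_items: int = 70,
               stop_after: int = 4) -> List[int]:
    """Return the items passed, stopping after `stop_after` successive misses."""
    passed: List[int] = []
    misses_in_a_row = 0
    for item in range(start_item, n_items + 1):
        if answers_correctly(item):
            passed.append(item)
            misses_in_a_row = 0  # a correct answer resets the run of misses
        else:
            misses_in_a_row += 1
            if misses_in_a_row == stop_after:
                break
    return passed

# Hypothetical first grader: passes Items 1-20 and Item 22, misses the rest.
first_grader = administer(1, lambda item: item <= 20 or item == 22)
# Hypothetical fifth grader: starts at Item 35 and passes through Item 50.
fifth_grader = administer(35, lambda item: item <= 50)
```

In the first example the single miss on Item 21 does not end testing,
because Item 22 is passed; testing ends only after Items 23-26 are
missed in succession.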

The test showed high internal consistency, with Cronbach's alphas 
ranging from .93 to .95. 
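
For readers unfamiliar with the statistic, Cronbach's alpha for k items
is k/(k - 1) times one minus the ratio of the summed item variances to
the variance of the total score. A minimal sketch on made-up data
follows; the toy score matrix is purely illustrative and has nothing to
do with the 70-item test.

```python
from statistics import variance
from typing import Sequence

def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """scores[i][j] is the score of child i on item j (1 = correct, 0 = missed)."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Five hypothetical children on four items of increasing difficulty.
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
```

For this toy matrix the item variances sum to 1.0 and the total-score
variance is 2.5, so alpha is (4/3)(1 - 0.4) = 0.8.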
6. Findings - Part II

Differences between boys and girls were not statistically significant
at the .05 level. For both grade levels, children in the United States
had significantly lower scores than did children in Taiwan and Japan,
even when the topics were known to have been covered in the American
textbooks. On both story problems and computational skills, Japanese
children scored higher than the children from Taiwan at grade 5 but not
at grade 1.



TABLE 4

AVERAGE PERFORMANCE FOR BOYS AND GIRLS IN THE THREE COUNTRIES

                        Boys             Girls
Country               M      SD        M      SD

Grade 1
  Japan              20.7    5.7      19.5    4.6
  Taiwan             21.2    5.4      21.1    5.6
  United States      16.6    5.5      17.6    5.2

Grade 5
  Japan              53.0    7.5      53.5    7.5
  Taiwan             50.5    6.4      51.0    4.9
  United States      45.0    6.5      43.8    5.9
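
To give a sense of the size of the country differences in Table 4, the
means and standard deviations can be converted to a standardized mean
difference. The abstract reports no effect sizes; the sketch below is
an added illustration that uses Cohen's d with the root mean square of
the two standard deviations, which assumes equal group sizes (plausible
given the sampling design, but an assumption nonetheless).

```python
from math import sqrt

def cohens_d(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Standardized mean difference, pooling the two SDs with equal weight."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Boys, Japan vs. United States, values taken from Table 4.
d_grade1_boys = cohens_d(20.7, 5.7, 16.6, 5.5)
d_grade5_boys = cohens_d(53.0, 7.5, 45.0, 6.5)
```

On these figures the Japan-United States gap for boys is roughly 0.7
standard deviations at grade 1 and a little over 1.1 standard deviations
at grade 5.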

It was found that more classroom time was devoted to mathematics
instruction in both Japan and Taiwan than in the United States. American
first-grade students reported by far the lowest mean number of minutes
spent on homework each week: Sendai, 233 minutes; Taipei, 496 minutes;
Minneapolis, 79 minutes. The respective means at grade 5 were 368, 771,
and 256. American parents also spent the least amount of time assisting
their children with homework. Average class sizes reported were:
Taiwan, 47; Japan, 39; the United States, 21.

7. Interpretations

The superior performance of Japanese children is partly related
to Japan's advanced curriculum; however, factors other than curriculum
appear to be critical in accounting for American children's lagging
behind children from Taiwan and Japan. Factors suggested by the
data include amount of instruction time, amount of homework, and
amount of parental involvement.

Abstractor's Comments

The authors have appropriately noted that differences in curricula
must be taken into account in order to begin to understand
cross-national differences in achievement. They further noted that
methodological limitations in cross-national studies often hinder
interpretations. Both assertions address issues critical to the study
that they themselves have reported.

An analysis of textbooks served as the main method of acquiring an
understanding of the content of the mathematics curriculum. Classroom
observations to study the curricula were seen as too costly. As a
result, the curriculum analysis did not include any significant
details on actual implementation in the classroom. Such a shortcoming
severely limited the study.

Although many educators stereotype teachers as relying almost
solely on the textbook to formulate lessons, we do not know if indeed
this was the situation for the teachers at the three sites. Furthermore,
by using a procedure in which coders merely checked whether or not a
concept or skill was present, the study implicitly assumed equivalent
quality in the writing of the textbooks. [The authors use the term
"quality" in a different sense, basically to mean "higher level" or
"more advanced" (p. 317).] Another problem, which the authors did
indicate, was the fact that no attempt was made to determine the relative
importance of concepts or skills in each curriculum.

These concerns, combined with the questionable representativeness
of the three cities in the study, dictate that the study be treated
as tentative, perhaps more so than did the authors.

Some positive points in the method used include: 

a) random sampling with stratification by sex and achievement
level (but by reading level rather than mathematics achievement);

b) the strategy used to eliminate children with IQs below 70; and

c) the manner in which the test was constructed, based on the
curriculum (i.e., textbook) analysis.

Intriguing but not fully reported are the parent interviews and
classroom observations. It was reported, for example, that observations
showed that the American children spent less class time in mathematics;
however, data on the children's on- and off-task behaviors were
collected but not reported. Benjamin Bloom (1981) has stated that
research from the international study reported by Husen (1967)
showed Japanese students having a substantially higher engagement rate
(on task) than did American students. Bloom asserted that the observed
differences in engagement rate were large enough to fully account for
the achievement differences. Given these earlier research findings, it
would seem desirable to analyze the achievement differences in terms
of time on task.

Despite its methodological shortcomings, the study has provided
evidence that there are notable differences in curricula as well as 
achievement among the sites studied in the three countries. In order to 
determine more precisely the reasons for these differences, it would 
be necessary to make the following improvements (albeit costly) in the 
design: 

a) a curriculum analysis that includes an investigation of the 
quality and emphasis of the various concepts or skills as written in 
the textbooks; 

b) an observational study of the implementation of the curriculum; 

c) a covariate measure of students' mathematical ability; and

d) a sampling from other geographical areas in the three countries.

It would appear that the differential amount of time spent
(including both instruction and homework) on mathematics would again
turn out to be the leading candidate for "causing" the achievement
differences. On the other hand, an analysis of the quality of instruction
[e.g., in terms of the use of active teaching behaviors that have been
shown to be related to student achievement (Brophy, 1979)] may reveal
other explanations for the differences in achievement.

Finally, a major reason for conducting such cross-national studies 
is the ultimate improvement of instruction in each of the countries. 
Although there were differences in test scores, it would seem reasonable 
to expect that all of the countries could benefit by getting a better 
understanding of what was being done in the other countries as well 
as their own. 

Stones, Ivan; Beckmann, Milton; and Stephens, Larry. SEX-RELATED
DIFFERENCES IN MATHEMATICAL COMPETENCIES OF PRE-CALCULUS COLLEGE
STUDENTS. School Science and Mathematics 82: 295-299; April 1982.

Abstract and comments prepared for I.M.E. by DAIYO SAWADA,
University of Alberta, Edmonton.

1. Purpose

"The purpose of this study is to investigate sex differences
in the achievement on mathematical competencies among college
students in pre-calculus mathematics courses" (p. 295).

2. Rationale 

Previous research relating to sex differences in mathematical
competencies has been done at the high school level, indicating
that boys achieve at a higher level than girls when mathematics
course background is not considered, but that such differences
disappear when course background is considered (Davis, 1950).
On the other hand, Rust (1964) found that consideration of course
background did not eliminate higher male achievement.

3. Research Design and Procedures

The test used to measure mathematical competence was the
Beckmann-Beal Mathematical Competencies Test for Enlightened
Citizens, which contains 48 items sub-categorized into 10 scales,
with one item relating to each of the competencies identified in a
1972 NCTM report (see Edwards, 1972). The test was administered
to 1046 students who in the first semester of the 1976-77 school
year were enrolled in 38 mathematics classes which could be
categorized as "College Algebra" or "Mathematics for Elementary
Teachers" or "Applied Mathematics" at four state and six community
colleges. Students were categorized into five strata according to
mathematical background, using a "Years of Math" variable (see Table 1).

Table 1

Classification of Students Based on High
School Mathematics Background

Group   Years of Math   Description of Mathematics Courses

  1       0-.5          Vocational Math, General Math, Business Math
  2       .5-1.0        Minimal College Preparatory
  3       1.5-2.0       Average College Preparatory
  4       2.5-3.0       Above Average College Preparatory
  5       3.5-5.0       Strong College Preparatory

                                                          (p. 296)



Sex differences were compared using "two-sample T-tests"
(p. 296) for each of the ten subcategories and for the test as a
whole for all of the students, and again for each of the five
mathematical background strata.

4. Findings



When mathematical background was not considered, males scored
significantly higher (0.01 level) than females on three subcategories:
Geometry, Measurement, and Probability and Statistics.
Females scored significantly higher (0.05 level) on one subcategory,
Mathematical Reasoning. On the test as a whole, there were no
significant sex differences.

When mathematical background was considered, "a clearer picture
is obtained." Table 2 gives the means for each sex for each
background stratum on each of the ten subcategories.

Table 2

Means for Males and Females of Competency
Sub-Category Scores for Different Math Backgrounds

Math Background             1               2               3               4               5
                            M       F       M       F       M       F       M       F       M       F
Sample Size                 28      21      71      54      148     119     180     171     143     111

Subcategory
Numbers and Numerals        4.36    3.86    4.76    4.76    5.26    5.13    5.76    5.68    6.08    6.10
Operations and Properties   4.96    4.38    5.39    5.48    6.18    5.88    6.68*   6.54    6.88    6.93
Mathematical Sentences      1.54    1.62    1.80    1.91    2.13    2.03    2.41    2.36    2.41    2.59*
Geometry                    2.96    1.95**  3.03    2.81    3.63    3.46    4.14    3.91*   4.4?    4.26
Measurement                 3.14    3.00    3.30    3.13    3.78    3.35**  4.02    3.88    4.22    4.12
Relations and Functions     1.64    1.38    1.79    1.61    1.96    1.80    2.22    2.15    2.42    2.39
Probability and Statistics  1.82    1.14*   2.08    1.69*   2.17    1.91*   2.44    2.25    2.59    2.44
Graphing                    2.21    1.67    2.65    2.63    2.95    2.86    3.31    3.22    3.55    3.50
Mathematical Reasoning      1.75    1.67    2.00    2.24    2.00    2.20*   2.25    2.34    2.40    2.34
Business and Consumer Math  3.82    3.57    3.94    4.06    4.61    4.27*   4.74    4.76    4.73    4.87
All Categories              28.21   24.24   30.75   30.31   34.66   32.89*  37.97   37.09   39.68   39.53

* P<.05   ** P<.01   (? marks a digit illegible in the source)

                                                          (p. 298)
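
The All Categories row appears to be the column sum of the ten
subcategory means (the columns that can be added up agree to within
rounding), so any single entry can be cross-checked by subtracting the
other nine means from the column total. A minimal sketch of that check
follows, applied to the Probability and Statistics mean for Group 1
females and the Geometry mean for Group 4 females. The helper name
`implied_mean` is illustrative, and because the published means are
rounded the result is only good to about one or two hundredths.

```python
from typing import Dict, List, Union

Column = Dict[str, Union[List[float], float]]

# The nine other subcategory means and the All Categories mean for two columns.
group1_females: Column = {
    "others": [3.86, 4.38, 1.62, 1.95, 3.00, 1.38, 1.67, 1.67, 3.57],
    "total": 24.24,
}
group4_females: Column = {
    "others": [5.68, 6.54, 2.36, 3.88, 2.15, 2.25, 3.22, 2.34, 4.76],
    "total": 37.09,
}

def implied_mean(column: Column) -> float:
    """Column total minus the nine other means, rounded to two decimals."""
    return round(column["total"] - sum(column["others"]), 2)

prob_stat_group1_f = implied_mean(group1_females)
geometry_group4_f = implied_mean(group4_females)
```

The same arithmetic applies to the sample sizes: the five male counts
(28, 71, 148, 180, 143) add to the 570 males reported, and the 476
females reported less the four larger female groups (54, 119, 171, 111)
leaves 21 for Group 1.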



On the 50 mathematics background-subcategory combinations, males
scored significantly higher on 8 (16%) of the comparisons, while
females scored significantly higher on 2 (4%). "Therefore, when
mathematics background is taken into consideration, there is no
real difference in mathematics competency due to sex in 80% of the
background-subcategory combinations" (p. 297).

5. Interpretations

"It is of interest to note" (p. 297) that at the subcategory 
level females did significantly better than males on Mathematical 
Sentences and Mathematical Reasoning, while males did significantly 
better on Geometry, Measurement, Probability and Statistics, and 
Business and Consumer Mathematics. There is a contrast in these 
subcategories: the ones in which males did better are ones "in 
which knowledge of specific course content was important" (p. 297), 
while the ones in which females did better are ones "in which the
ability to reason mathematically was important, but in which specific
course content was not critical" (p. 297). In commenting further on
the above conclusion, the authors close their report with this
statement:

These results tend to reinforce the notion that there is
actually no difference in mathematics ability due to sex,
but in filling the role society has created for males and
females that males may put more effort in mastering the
traditional courses encountered in high school. (p. 299)

Abstractor's Comments

At the outset, it must be noted that, because the Stones,
Beckmann, and Stephens (1982) study is reported in such abbreviated
form, the fidelity between the report itself and the study
it purports to represent is problematic to an uncomfortable
degree. It would have been unnecessary to raise many of the
issues below if the authors had taken (or were allowed) more
space to report their study. In this review I have taken the
opportunity to focus on the problems that are created and
aggravated when a research report suffers a large credibility
gap in representing the research actually done.

In the paragraphs to follow, the various issues raised 
are issues that become problematic when the concerns listed 
below are neglected. 


1. Need for explicitness in a report as regards 

a. a rationale, 

b. a design, 

c. a set of focusing questions or hypotheses.

2. Appropriateness of the analysis. 

3. The connection between data analysis and conclusions. 

4. Attention to critical aspects of methodology. 

Need for Explicitness

At first reading, this study seems deceptively simple and
straightforward: the administration of an achievement test to
570 male and 476 female college students categorized into five
strata of mathematical background, followed by 66 t-tests, presumably
to determine whether mathematical background can account for the
sex differences. The study looks to be a standard two-factor
quasi-experimental status study with sex as one factor, mathematical
background as the other, and the Beckmann-Beal as the dependent
variable. However, the simplicity begins to dissolve into complexity
when the one and only conclusion stated refers not at all to the
difference in results obtained when mathematical background is
considered and when it is not, thus raising doubts about the intended
purpose of the design of the study. In hopes of clarifying these
doubts, I returned to the statement of the purpose of the study
but found a statement so general as to be of no help at all. The
purpose simply announces that sex differences in regard to mathematical
competence will be "investigated". Further, since no statement
of questions or hypotheses is provided, the intent of the study
remains completely ambiguous. I next looked again at the introduction,
hoping to find a rationale that might shed light on the explicit
focus of the study. Again, there really is no rationale; there is
simply a brief review of three studies with no explicitly stated
connection to the study. What connection there is is by implication:
the three studies were done at the high school level; this study is
done at the college level; apparently then, since the other studies
focused on mathematics background as the variable of concern, so
might this study. Thus, if any hypotheses were to have been stated
they might have related to the expectation that, when mathematics
background is considered, a considerable portion of the variance
on the Beckmann-Beal due to sex would be accounted for. If this is
so (and, of course, from the report there is no way of knowing if
this is so), why did the authors not state any conclusions relative
to this unstated hypothesis that was apparently tested using 66
t-tests? Did they think the t-test analysis was inconclusive?
To be fair, the authors did state in the "Results" section that
"therefore, when mathematics background is taken into consideration,
there is no real difference in mathematics competency due to sex
in 80% of the background-subcategory combinations" (p. 297).
However, they make no comment on the degree of difference in
mathematics competence due to sex when mathematics background is
NOT taken into account, and, by omission, apparently leave it up to
the reader to draw the inference that, on the basis of the analysis
done when mathematical background was not considered, there was
more difference in mathematical competence between males and females.
Unfortunately, the analysis does not support such an inference.
In fact, on the basis of the t-test analysis presented, and with
special reference to the differential power of the t-test in the
two settings, a strong case can be made for the conclusion that
consideration of mathematics background does NOT diminish the
differences between the sexes.
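
The point about differential power can be made concrete with a rough
calculation. The standard error of a difference between two group means
is proportional to the square root of (1/n1 + 1/n2), so a comparison
within one background group has a larger standard error than the
comparison on all 1046 students. The sketch below assumes the same
within-group standard deviation in every setting, which the report does
not state, so only the ratios are meaningful; the sample sizes for
Groups 2 through 5 are taken from Table 2.

```python
from math import sqrt

def se_factor(n_males: int, n_females: int) -> float:
    """Standard error of a mean difference, in units of the common SD."""
    return sqrt(1 / n_males + 1 / n_females)

pooled = se_factor(570, 476)
strata = {2: (71, 54), 3: (148, 119), 4: (180, 171), 5: (143, 111)}

# How many times larger a true mean difference must be within a stratum
# to produce the same t statistic as in the pooled comparison.
inflation = {group: se_factor(m, f) / pooled for group, (m, f) in strata.items()}
```

Under this assumption a difference in Group 2 has to be nearly three
times as large as a difference in the full sample to reach the same
t value, and even in the largest group (Group 4) about 1.7 times as
large. Fewer significant results within strata are therefore expected
even if the underlying sex differences are unchanged.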

What conclusion, if any, is to be drawn from the study?
I suspect that the authors realized there were no strong conclusions
supported by the data, so that, when they came to write something
in the "Conclusions" section, they chose to begin with the words
"It is of interest to note" (p. 297). With this rather parenthetical
beginning, it is difficult to accept the conclusion (as indicated in
the abstract) as anything more than an afterthought, a simple post
hoc inference that, while interesting and insightful and perhaps
serendipitous, is presumably not the intended product of the design
of the study. I believe it is incumbent upon the authors of any
research report to state explicitly in question form or other specific
form just what the study intended to find out. Without such a
statement, and given the complete nonspecificity of the purpose,
the design, and the rationale, I am led to assume that the authors
have seized upon an apparent serendipitous finding and presented it
as the conclusion.

Appropriateness of the Analysis

Doing 66 t-tests is certainly a very gross way of analyzing data
from a study which in purpose is apparently similar to the three
studies (particularly the Davis (1950) and Rust (1964) studies) cited
in the introduction. That purpose was to assess the degree to
which mathematics background, when taken into consideration, could
account for the sex differences in mathematics competence. It
would seem that a series of simple two-way analyses of variance on
the subcategories of the Beckmann-Beal (or a multivariate two-way
analysis of variance) would have been much more appropriate.
But then, perhaps the purpose of the study was not similar to the
studies reviewed in the introduction.
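
One reason a battery of 66 separate t-tests is a gross analysis is the
accumulation of Type I error. The sketch below is an added illustration,
not part of the report: it treats the tests as independent and all run
at the .05 level, both of which are simplifying assumptions (the
subcategory scores are surely correlated, and some results were reported
at the .01 level).

```python
alpha = 0.05
n_tests = 66

# Chance of at least one spurious "significant" result across all tests.
familywise_error = 1 - (1 - alpha) ** n_tests

# Per-test level that would hold the familywise rate near .05 (Bonferroni).
bonferroni_alpha = alpha / n_tests

# Spurious results expected among the 50 background-subcategory comparisons.
expected_chance_hits = 50 * alpha
```

Under these assumptions the chance of at least one spurious result
exceeds .96, and about 2.5 of the 50 background-subcategory comparisons
would be expected to reach the .05 level by chance alone, against the
10 that the authors report.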

Connection of Analysis with the Conclusions 

The authors close their report with a rather bold explanatory
statement (quoted in the abstract) concerning what they believe
to be a significant aspect of taking mathematics courses in high
school: "males may put more effort in mastering the traditional
courses encountered in high school." Now while this may be a true
statement, it by no means follows from the analysis of the data.
There are no data in the study that suggest that it is "effort"
that differentiates males from females in traditional mathematics
courses in high school. There are even less data to support the
conclusion that males put forth more effort than females in high
school mathematics courses. In fact, in recent studies (as, for
example, the NAEP results released at the 1983 NCTM Annual Meeting
at Detroit), if females do do better than males, it is only at the
Knowledge level as opposed to higher levels (in direct contrast to
the results of this study), suggesting that, to use the reasoning of
the authors, females may put forth more effort in mathematics
courses, effort that likely has to do with memorization as opposed to
understanding. In summary, the statements made by the authors as
conclusions might at best function as hypotheses or as topics for
further study, but definitely not as conclusions of this study.

Methodology of the Study

There are a number of other shortcomings in the report, some
due directly to its brevity, that raise doubts as to the credibility
of the research.

1. The methodology is primarily that of a survey done with
cluster sampling on a stratified population. A carefully prepared
sampling plan is key to the validity of a survey, but there is no
indication whatsoever in the report that any aspect of survey
methodology was followed or was of concern. For example, nothing
is mentioned in regard to (a) how the 10 colleges were sampled from
the colleges available (were there any others available?), (b)
were the 38 classes sampled the only ones eligible in the 10 colleges?
(c) were all students in a given class tested or were some excluded
and if so on what basis? (d) who administered the tests (the
researcher, the classroom teacher, the principal)? (e) does the
total number of students tested (1046) represent 100% return or is
it closer to 50%, and so on. As is often the case in educational
surveys, the study is dominated by a conception of quasi-experimental
design, when the more critical paradigm is that of a survey. For
example, in the study, diverse populations were combined (preservice
elementary school teachers, presumably mostly female, lumped
together with students taking applied mathematics courses presumably
at community colleges, presumably mostly males) to get a single
population that would contain approximately equal numbers of both
sexes. Question: are the females and males comparable in this
amalgamated population, or might it be that the females are
predominantly teachers-to-be and might represent a slice of the female
population which is academically more talented than the population
represented by the males in the study? This is a very serious
sampling issue. Again, there is insufficient information in the
report to assess adequately issues such as this. However, it is
interesting to note that the major conclusion of the study could
be substantially explained in terms of the alternate hypothesis that
the females are actually from an academically more talented population
than are the males in the study. If so (and there is nothing in the
report to suggest otherwise), then the males would represent the
less able student who would find it more difficult to see the wider
significance of the mathematical content that he is learning, and
thus not do so well on general mathematics reasoning items. The
female, being representative of the more able student, would perhaps
not pay any more attention to the content than the less able male,
but would possess general strategies and superior reasoning skills
which would serve well on items that are less content-specific.
While hypotheses such as these are purely speculative, the point in
raising them here is to stress the importance of an adequately
described sampling plan. What description is provided does little to
dispel the possible validity of such speculation.

2. No psychometric specifications are provided for the
mathematical competencies test (Beckmann-Beal).

There are some less important matters of reporting that are
bothersome but not critical, such as the use of "T-test" as opposed
to "t-test", particularly in a situation in which there are multiple
dependent variables which could easily and perhaps more appropriately
be compared using a multivariate analysis, in which case an uppercase
T is involved in the notation referring to the test derived by
Hotelling. Also, the use of "subcategory" and "sub-category" in
different places in the report should have been picked up (both
versions are also used in the abstract in an attempt to remain true
to the original). There are other trivial inconsistencies not worth
mentioning, but such trivialities tend to be taken as indicative of the
care with which the study was done when larger issues of critical
import are problematic.

Closing Remarks 

I have taken a "devil's advocate" role in reviewing this study.
This was not my original intention. But the deeper I got into the study
the deeper I became embroiled in the ambiguities that compounded
themselves each time I returned to the report to seek further
clarification of an issue: the report was at times so vague, so
incomplete, so amorphous, so nonspecific, and so unconnected, that not
only did it prevent resolution of problems of interpretation, it also
prevented any clear identification of the study from ever emerging.
Every time I tried to express an issue of concern, I had to make such
copious use of terms such as "apparently", "presumably", "if this is
so", and so on that it was difficult ever to make an unconditional
statement. However, I would like to end this review on an optimistic
note. The authors have identified a significant perspective in their
conclusion which when generalized suggests that it may be more
productive to view the role of mathematics background not as a control
or context variable, but as a process variable: it is the differential
way in which the sexes "take" mathematics courses, and student effort
may be a part of this, which is significant (as well as the amount
of such coursework). More recent research (more recent in the
sense of being done later, but perhaps not reported later, such
as Becker (1981)), has focused on coursework as a process variable
with important findings. The significance of the present study
lies in supporting the validity of the perspective of studies such
as Becker's.

Abstract and comments prepared for I.M.E. by WERNER LIEDTKE,
University of Victoria, Canada.

1. Purpose

The study was designed to create a cognitive process model for 
the retrieval of selected basic subtraction facts by young children 
during their first three years in school. 

2. Rationale 

Process models consider two main categories of cognitive processes
in reaching the answer of a simple problem. When an answer is
directly retrieved from storage in long-term memory (LTM), the process
is labelled reproductive. When conscious derivations or manipulations
in working memory are required to reach an answer, the process is
labelled reconstructive. Examinations of existing process models
led the investigators to conclude that "relatively little interest
has been shown retrieval processes in simple arithmetics." To
study the development of cognitive skills, the topic of subtraction
facts was chosen.

Since neither the type of memory nor the counting procedures (up 
or down) can be revealed in regression analyses of latencies, verbal 
reports and behaviors were analyzed. 

3. Research Design and Procedures

A slide projector was used to present 66 subtraction facts of the
form M - N where M ≤ 13, (M - N) ≥ 1, N ≠ 0, and N ≠ 1.
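
The count of 66 facts gives a check on how the constraints should be
read, since the inequality signs are easily lost in reproduction. The
short enumeration below assumes whole-number operands and reads the
constraints as M at most 13, a difference of at least 1, and N neither
0 nor 1; on that reading the item set has exactly 66 members, whereas
strict inequalities (M below 13, difference above 1) would give only 45.

```python
# Reading assumed here: M <= 13, M - N >= 1, N != 0, N != 1.
facts = [
    (m, n)
    for m in range(0, 14)
    for n in range(0, m + 1)
    if m - n >= 1 and n not in (0, 1)
]

# Alternative strict reading, shown only for comparison.
strict_facts = [
    (m, n)
    for m in range(0, 13)
    for n in range(0, m + 1)
    if m - n > 1 and n not in (0, 1)
]
```

The smallest fact under the assumed reading is 3 - 2 and the largest
minuend is 13, as in 13 - 12 or 13 - 2.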

Sixteen subjects from two classes, ranging in age from 7.2
to 8.1, were tested individually in the spring of the first school
year. During this testing, only reaction time measures were obtained.
The subjects were retested in the middle of each semester in grade 2.

By the time the retesting took place in the middle of each semester
in grade 3, twelve subjects made up the sample. The data collected
about each subject in grades 2 and 3 included reaction time measures and
use of memory aids (finger counting), as well as verbal explanations
about solution procedures. The counting procedures and the verbal
responses were categorized, frequency distributions were constructed,
and changes in cognitive strategies over the test periods in grades
2 and 3 were noted.

4. Findings

To accommodate the solution strategies, fourteen categories were
created. The two main categories for the reconstructive cognitive
processes involved either counting up (U) or counting down (D). The
U-categories for (M - N) included: counting orally by one from
(N + 1) to M and using the number of counts as the answer; using fingers
to count by one from (N + 1) to M and reading the answer from the
fingers; and counting up from N in steps greater than by one and keeping
track of the increments. The D-categories for M - N included:
orally counting down and decreasing M by one, N times; orally
counting down from M by one until N is reached and using the number
of counts as the answer; orally counting down in steps greater than
by one; and keeping track of counting N steps down from M with fingers
which record the count as well as show the difference.
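The by-one counting procedures in the U- and D-categories above can be written out as simple loops. The following is a minimal sketch for illustration only; the function names are hypothetical, and the code models the counting steps as described in this abstract, not any formal model from the report.

```python
def count_up(m, n):
    """Count by ones from N + 1 to M; the number of counts is the answer."""
    counts = 0
    current = n
    while current < m:
        current += 1
        counts += 1
    return counts


def count_down_n_times(m, n):
    """Decrease M by one, N times; the number reached is the answer."""
    current = m
    for _ in range(n):
        current -= 1
    return current


def count_down_to_n(m, n):
    """Count down from M by ones until N is reached; the number of counts is the answer."""
    counts = 0
    current = m
    while current > n:
        current -= 1
        counts += 1
    return counts


for strategy in (count_up, count_down_n_times, count_down_to_n):
    print(strategy.__name__, strategy(11, 4))
```

All three procedures give 7 for 11 - 4, but the number of counting steps differs: M - N steps when counting up or counting down until N is reached, and N steps when M is decreased by one N times.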

Other categories included: counting up all numbers on fingers,
i.e., M is counted, N is counted, N is taken away, and the remaining
fingers are counted; for M < 11, M is represented without counting,
N is taken away and the remainder is recognized as the answer; additions
with equal addends (doubles) are used to find the solution for
M - N; and substitution of a simpler problem, e.g., 11 - 4 = 10 - 3.

The remaining categories were: no description of solution;
unsolved problems; and, of course, the immediate recall response or
reproductive solution (LTM).
The major observations abstracted from a table showing the
distribution of solutions over different strategies for all subjects 
and all years include the following: 

- the proportion of retrieved LTM answers is a little less
than one third of all the answers.

- the use of fingers as an external memory aid is frequent (36%).

- common strategies include: counting down without use of
fingers (10%), counting down on fingers (15%), and unsolved
problems (10%).

- all subjects used a variety of strategies. No one used fewer than
9 of the 12 different ways of solving the problems.

Line graphs are drawn to show the increase of responses over
the testing period in the LTM and the counting up categories.
Decreases in the categories involving the use of fingers and in the
no answer category are also shown.

Arrow diagrams and calculated proportions support the following
conclusions:

- the use of strategies involving the use of fingers and responses
in the no answer category decreased.

- increases exist for LTM solutions and strategies involving
"counting up."

- initially the most frequent strategies were LTM, "no answer",
and those involving the use of fingers.

- the probability was high that the same strategy would be used
for the same problem during following test sessions. This
is especially true for LTM solutions.

- changes in strategies from "counting down" to LTM were observed. 

- at the final testing session, about two-thirds of the responses
fell into LTM and into the counting down categories.

- the major trend for changes in strategies is one from the
lower level to a higher level in memory use.

- about one-fourth of all solutions during the last testing
utilized an external memory aid.

- the counting down on fingers strategy is quite common and
constant (27% of all solutions in grade 2 and 15% in grade
3). This is almost the only strategy which is quite often
preceded by the more advanced LTM strategy. It seems to
be the intermediate in the evolution from external memory
and strategies to LTM solutions.
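The "calculated proportions" behind conclusions of this kind can be understood as transition proportions: for each strategy used on a problem in one session, the share of cases in which each strategy was used on the same problem in the next session. A minimal sketch follows. The records and strategy labels are invented purely for illustration and are not data from the study, and the authors' actual calculation may differ.

```python
from collections import Counter, defaultdict


def transition_proportions(pairs):
    """Map each earlier strategy to the proportions of strategies that followed it.

    `pairs` is an iterable of (earlier_strategy, later_strategy) tuples, one per
    subject-problem combination observed in two consecutive sessions.
    """
    counts = defaultdict(Counter)
    for earlier, later in pairs:
        counts[earlier][later] += 1
    return {
        earlier: {
            later: c / sum(followers.values())
            for later, c in followers.items()
        }
        for earlier, followers in counts.items()
    }


# Invented example records (hypothetical, NOT data from the study).
observed = [
    ("LTM", "LTM"), ("LTM", "LTM"), ("LTM", "LTM"), ("LTM", "count_down"),
    ("count_down", "LTM"), ("count_down", "count_down"),
    ("fingers", "count_down"), ("fingers", "fingers"),
]

for earlier, followers in transition_proportions(observed).items():
    print(earlier, followers)
```

A high proportion on the "same strategy" entry (here, LTM followed by LTM) is what a statement such as "the same strategy would be used for the same problem" would correspond to in such a table.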

5. Interpretations 

The study has shown a gradual evolution of children's strategies 
which begin with no attempt to solve and then include stages that 
involve: representing all numbers on fingers (external memories);
representing only the minuend with fingers; settings where working
memory replaces external memories; retrieving the answer from
long-term memory. At the time of the final testing, many more problems
than a teacher would like were solved with the aid of fingers.

Abstractor's Comments

It is refreshing to read a study that involved the same subjects 
over a three-year period. One can only surmise that the investigators 
must have in their possession an abundance of interesting data. 

As the report was read and summarized, the following questions
and comments came to mind: 

(a) How were the subjects sampled from the two classes? Why
were these subjects chosen? Were both classes in the same 
school? Were they taught mathematics by the same teacher? 

(b) Why were examples of the type M - 0 excluded from the 
investigation? 

(c) Why was the spring of the first school year chosen for the 
initial testing? How long had the subjects been in school? 
What topics in mathematics had been taught prior to the 
testing?

(d) Verbal responses were collected from each subject. What
specific questions were asked? How were the responses
recorded? Were they taped or filmed? Were they coded by
the investigators? (Did the authors of the report collect
the data themselves?)

(e) The verbal responses were classified into solution categories.
Were any measures of reliability for this procedure calculated?
Are any data on inter-rater agreement available?

(f) The point is made that during the initial testing session
in grade 1 no verbal data were collected, only reaction time
measures. Yet some of the figures, especially Figure 5,
show a classification of responses from this setting which
would seem to be impossible to obtain without verbal comments
from the subjects.

(g) The numerousness of the solution categories was attributed 
to the young age of the subjects. Couldn't the assumption 
be made that the variety of strategies increases as new 
mathematical skills and ideas are learned? Beattie (1979) 
identified just as many different strategies for fifth 
and sixth graders as the authors of this report. 

(h) The report includes the observation that "it is interesting
to observe that all subjects use many different strategies.
... no one reports the use of less than 9 of the total of
twelve ..." Which of these strategies are directly
attributable to the curriculum objectives, the pupils'
materials, the school, or the teacher? Which of the strategies
are a direct result of teaching? Which of the strategies
seem to be developed by the subjects? What might some
possible reasons be for this "self-development" of strategies?
Could this in any way be related to some special personality
characteristics or some special behavior patterns?

(i) The increase in percent for LTM solutions, the decrease in 
percent for solutions obtained by counting on fingers, and the 
decrease of responses in the no answer category could be
directly related to the objectives for any mathematics 
program for the early grades. How could the increase in 
response for the "counting up" category be explained? 
Could it be that some subjects were taught a counting up 
procedure for finding missing addends (a + □ = b) and then 
used this procedure for solving subtraction facts?

(j) The authors claim to have shown a gradual evolution of
children's strategies for solving subtraction facts. No
attempt is made to identify which part of this evolutionary
sequence is directly related to the teaching these
subjects have been exposed to. A teaching sequence for
subtraction (basic facts) usually involves the following
phases:

1. Understanding - identification of subtractive
action from experience
- introduction of symbol
- use of concrete materials to
simulate the action

2. Thinking Strategies - properties, patterns,
relationships among facts

3. Activities - practice, problems, games.

Which thinking strategies were these subjects taught 
during phase 2 of the above sequence? Rather than having 
discovered the gradual evolution of children's strategies,
could it be that the investigators identified the teaching 
sequence for subtraction (facts)? (Perhaps the author of the 
mathematics program considered a "gradual evolution" 
similar to the one identified by the authors as the program 
was prepared?)

(k) The point is made that "many more problems than a teacher
would like, were solved with the aid of the children's
fingers as late as in the last term of the third school
year." At what age/grade level are these children expected
to recall these facts with "reasonable speed and accuracy"?

In general, one is left with the feeling that the authors did
not go far enough in the discussion part of their report. No
implications for educational settings are stated. Some information is
missing, and this would make replicating the study a difficult
task indeed. However, the results of the study can be used
to generate some interesting research questions.